Learning Question Paraphrases for QA from Encarta Logs

نویسندگان

  • Shiqi Zhao
  • Ming Zhou
  • Ting Liu
چکیده

Question paraphrasing is critical in many Natural Language Processing (NLP) applications, especially for question reformulation in question answering (QA). However, choosing an appropriate data source and developing effective methods are challenging tasks. In this paper, we propose a method that exploits Encarta logs to automatically identify question paraphrases and extract templates. Questions from Encarta logs are partitioned into small clusters, within which a perceptron classier is used for identifying question paraphrases. Experiments are conducted and the results have shown: (1) Encarta log data is an eligible data source for question paraphrasing and the user clicks in the data are indicative clues for recognizing paraphrases; (2) the supervised method we present is effective, which can evidently outperform the unsupervised method. Besides, the features introduced to identify paraphrases are sound; (3) the obtained question paraphrase templates are quite effective in question reformulation, enhancing the MRR from 0.2761 to 0.4939 with the questions of TREC QA 2003.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning to Rank Effective Paraphrases from Query Logs for Community Question Answering

We present a novel method for ranking query paraphrases for effective search in community question answering (cQA). The method uses query logs from Yahoo! Search and Yahoo! Answers for automatically extracting a corpus of paraphrases of queries and questions using the query-question click history. Elements of this corpus are automatically ranked according to recall and mean reciprocal rank, and...

متن کامل

Learning to Paraphrase for Question Answering

Question answering (QA) systems are sensitive to the many different ways natural language expresses the same information need. In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks. Our method is trained end-toend using question-answer pairs as a supervision signal. A question and its ...

متن کامل

Learning Paraphrases to Improve a Question-Answering System

In this paper, we present a nearly unsupervised learning methodology for automatically extracting paraphrases from the Web. Starting with one single linguistic expression of a semantic relationship, our learning algorithm repeatedly samples the Web, in order to build a corpus of potential new examples of the same relationship. Sampling steps alternate with validation steps, during which implaus...

متن کامل

Question Paraphrase Generation for Question Answering System

The queries to a practical Question Answering (QA) system range from keywords, phrases, badly written questions, and occasionally grammatically perfect questions. Among different kinds of question analysis approaches, the pattern matching works well in analyzing such queries. It is costly to build this pattern matching module because tremendous manual labor is needed to expand its coverage to s...

متن کامل

Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

We frame Question Answering (QA) as a Reinforcement Learning task, an approach that we call Active Question Answering. We propose an agent that sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers. The agent probes the system with, potentially many, natural language reformulations of an initial question and aggregates the return...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007